Utilizing Clues in Syntactic Relationship for Automatic Target Word Sense Disambiguation
Authors
Abstract
A source word may have several translations in the target language, both because the source word has several meanings and because each meaning may have several target-word equivalents, depending on the context of the source word. This paper presents an automated approach to target-word selection based on the “word-to-sense” and “sense-to-word” relationships between source words and their translations, using syntactic relationships (subject-verb, verb-object, adjective-noun). Translation selection proceeds in two steps: sense disambiguation of the source word, based on knowledge from a bilingual dictionary and word similarity measures from WordNet, followed by selection of a target word using statistics from a target-language corpus. Tests on English-to-Tagalog translation show an overall accuracy of 64% in selecting word translations, with a standardized precision of at least 80% for generating expected translations, using 200 sentences containing ambiguous words (an average of 4 senses) in three categories: nouns, verbs, and adjectives. The system is tested on 145,746 word pairs in syntactic relationships extracted from target corpora (317,113 words). The sense profile, with 2,681 entries for source words, is built from an existing bilingual dictionary that includes clues for disambiguation and target translations. The results show that using syntactic information improves the method's performance in resolving target-word ambiguity. Possible further improvements include the integration of other content words and syntactic categories, the addition of reliable clues for sense disambiguation, and the integration of smoothing techniques.
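The two-step selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sense profile, the similarity table (a stand-in for a WordNet-based measure), the corpus counts, and all function names are hypothetical toy values chosen for illustration, and the Tagalog entries are not taken from the paper's dictionary.

```python
from collections import Counter

# Toy stand-in for a WordNet-based word similarity measure (hypothetical values).
TOY_SIMILARITY = {
    frozenset(("cash", "money")): 0.9,
    frozenset(("steep", "river")): 0.3,
    frozenset(("steep", "money")): 0.05,
}

def similarity(a, b):
    """Return a toy similarity score in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)

# Hypothetical sense profile: each sense of a source word carries
# disambiguation clues and candidate target translations.
SENSE_PROFILE = {
    "bank": [
        {"sense": "financial institution",
         "clues": {"money", "deposit"},
         "translations": ["bangko"]},
        {"sense": "side of a river",
         "clues": {"river", "water"},
         "translations": ["pampang", "baybayin"]},
    ],
}

# Hypothetical target-corpus counts of word pairs in a syntactic relationship:
# (relation, translated partner word, candidate target word) -> frequency.
PAIR_COUNTS = Counter({
    ("adjective-noun", "matarik", "pampang"): 4,
    ("adjective-noun", "matarik", "baybayin"): 1,
})

def disambiguate_sense(source_word, partner):
    """Step 1 (word-to-sense): pick the sense whose clues best match the partner."""
    senses = SENSE_PROFILE[source_word]
    return max(senses,
               key=lambda s: max(similarity(partner, c) for c in s["clues"]))

def select_translation(sense, relation, partner_translation):
    """Step 2 (sense-to-word): pick the translation most frequent with the partner."""
    return max(sense["translations"],
               key=lambda t: PAIR_COUNTS[(relation, partner_translation, t)])

def translate(source_word, partner, relation, partner_translation):
    """Run both steps for one source word and its syntactic partner."""
    sense = disambiguate_sense(source_word, partner)
    return select_translation(sense, relation, partner_translation)

if __name__ == "__main__":
    print(translate("bank", "steep", "adjective-noun", "matarik"))
```

In this sketch, unseen word pairs simply count as zero; the smoothing techniques that the abstract mentions as future work would replace that simplification.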
Similar resources
Principled Disambiguation: Discriminating Adjective Senses with Modified Nouns
Recent corpus-based work on word sense disambiguation explores the application of statistical pattern recognition procedures to lexical co-occurrence data from very large text databases. In this paper we argue for a linguistically principled approach to disambiguation, in which relevant contextual clues are narrowly defined, in syntactic and semantic terms, and in which only highly reliable clu...
Automatic Target Word Disambiguation Using Syntactic Relationships
Multiple target translations are due to several meanings of source words, and various target word equivalents depending on the context of the source word. Thus, an automated approach is presented for resolving target-word selection, based on “word-to-sense” and “sense-to-word” source-translation relationships, using syntactic relationships (subject-verb, verb-object, adjective-noun). Translation...
Vowel Sound Disambiguation for Intelligible Korean Speech Synthesis
For speech synthesis systems that transform text materials into voice data, correctness and naturalness are the crucial measures of performance, the latter gaining more emphasis recently. In order to make synthesized voices natural, we must take into account pronunciation-related linguistic phenomena such as homographs, among others. The syntax certainly provides an important clue to disambiguat...
Construction of Semantic Relations for Enhancing Word Sense Disambiguation in Question Answering Systems
Word sense disambiguation is a significant problem at the lexical level of natural language processing. The philosophy is to determine the meaning of a word in a particular usage, by using sense similarity and syntactic context with corpus evidence as well as semantic relations from WordNet. A training set will be constructed for each word tag (using the corpus). Each training example is repres...
Sense Discriminative Patterns for Word Sense Disambiguation
Given a target word wi to be disambiguated, we define a class of local contexts for wi such that the sense of wi is univocally determined. We call such local contexts sense discriminative and represent them with sense discriminative (SD) patterns of lexico-syntactic features. We describe an algorithm for the automatic acquisition of minimal SD patterns based on training data in SemCor. We have ...